Structure-Preserving Pipelines for Digital Libraries
Authors
Abstract
Most existing HLT pipelines assume that the input is plain text or, at most, HTML, and either ignore the (logical) document structure or remove it. We argue that identifying the structure of documents is essential in digital library and other types of applications, and show that it is relatively straightforward to extend existing pipelines so that the structure of a document is preserved.
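As a rough illustration of what preserving structure can mean in practice, the sketch below runs a plain-text processing step over an XML document while recording where in the logical structure each piece of text came from. It is not the pipeline described in the paper: the XML input format, the function names `tokenize` and `run_pipeline`, and the regex tokenizer standing in for an existing pipeline stage are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET


def tokenize(text):
    # Hypothetical stand-in for an existing plain-text pipeline stage:
    # splits text into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def run_pipeline(xml_source):
    """Tokenize every text node, keeping the path of its enclosing element."""
    root = ET.fromstring(xml_source)
    results = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        # Text that appears directly inside this element.
        if elem.text and elem.text.strip():
            results.append((here, tokenize(elem.text)))
        for child in elem:
            walk(child, here)
            # Text that follows a child still belongs to the current element.
            if child.tail and child.tail.strip():
                results.append((here, tokenize(child.tail)))

    walk(root, "")
    return results


if __name__ == "__main__":
    doc = (
        "<article><title>Digital Libraries</title>"
        "<section><p>Structure matters.</p></section></article>"
    )
    for path, tokens in run_pipeline(doc):
        print(path, tokens)
```

A structure-discarding pipeline would instead flatten the document to one string before tokenizing, losing the distinction between the title and the paragraph; here each token list stays attached to its element path.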
Related resources
Design and Evaluation Indicators for Digital Libraries
Introduction: There has always been uncertainty regarding the concept and frameworks of digital libraries. Terms such as electronic library, virtual library, library without walls, hybrid library, and digital library have often been used together, or in place of one another, to convey the library concept. Studies have shown that so far there is no standard and universally accepted definition for digital libraries, howe...
Proposed content framework for digital literacy education to users in Iran
Aim: Today, digital literacy, as a set of skills that enable people to use digital space effectively for success in personal, educational, and professional life, has become a necessity in all societies, and public libraries are among the most important providers of digital literacy education in the world. Digital literacy education has not been considered in public libraries in Iran. The first s...
A Systematic Review of Data Mining Applications in Digital Libraries
Purpose: This study aimed to identify the applications of data mining in the provision of services, collection, and management of digital libraries. Methodology: This is an applied study in terms of purpose and a qualitative study in terms of method, conducted as a systematic review. For this purpose, articles were obtained by searching the databases of Springer, Emerald, ProQuest,...
Critical Success Factors of Digital Libraries in Iran: A Qualitative Research
Background and Aim: A myriad of IT projects have failed in recent years. Digital libraries (DLs), as the product of the use of IT in the library organization, have followed a similar trend. This paper studies the critical success factors (CSFs) of DLs in the context of Iran, with a special focus on the Iranian Ministry of Science, Research, and Technology. CSFs, in this paper, are those factors that if follo...
A Survey of Compliance with User Interface Evaluation Criteria in Farsi Web Pages of Self-Made and Purchased Digital Libraries in Iran
Purpose: In digital libraries, interaction between the user and the system is among the major issues in using library software. Therefore, finding appropriate software for this purpose is of high importance. This study aims to evaluate and analyze the criteria related to the user interface in Farsi web pages of self-made and purchased digital libraries in Iran. Methodology: This is an applied and eva...